- 
                Notifications
    You must be signed in to change notification settings 
- Fork 278
Add new Algorithms using explicit batch type #496
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
a2cc837    to
    2be89a0      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your contribution! The implementation is really neat. I have some questions regarding the names of the functions as you can see below.
Regarding the failure in the tests, I think you have to include xsimd_fallback.hpp in the cpp file, so that the compiler can find the default implementation when a type is not supported by the current instructions set.
        
          
                include/xsimd/stl/algorithms.hpp
              
                Outdated
          
        
      | template <class I1, class I2, class O1, class UF> | ||
| void transform(I1 first, I2 last, O1 out_first, UF&& f) | ||
| template <class I1, class I2, class O1, class UF, class UFB> | ||
| void transform_batch(I1 first, I2 last, O1 out_first, UF&& f, UFB&& fb) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not keeping transform as the function name?
5459d0a    to
    983a7c4      
    Compare
  
            
          
                test/test_algorithms.cpp
              
                Outdated
          
        
      |  | ||
| TEST(algorithms, reduce_batch) | ||
| { | ||
| const double nan = std::numeric_limits<double>::quiet_NaN(); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For ARM, vectorization for double is available only on 64bits arch. Therefore, this test should be guarded with something like
#if XSIMD_ARM_INSTR_SET >= XSIMD_ARM8_64_NEON_VERSION || XSIMD_X86_INSTR_SET >= XSIMD_X86_SSE2_VERSION
        
          
                include/xsimd/stl/algorithms.hpp
              
                Outdated
          
        
      | using enable_if_increment = typename std::enable_if<has_increment<T>::value>::type; | ||
|  | ||
| template <class T> | ||
| using enable_if_not_increment = typename std::enable_if<!has_increment<T>::value>::type; | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better to move these metafunctions in some detail namespace, they're not supposed to be part of the API.
        
          
                include/xsimd/stl/algorithms.hpp
              
                Outdated
          
        
      | typename = enable_if_increment<I2>, | ||
| typename = enable_if_increment<O1>, | ||
| typename = enable_if_not_increment<UF>, | ||
| typename = enable_if_not_increment<UFB>> | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be more readable to gather these conditions so that you can use a single enable_if condition. That could be something like:
template <class... Args>
struct have_increment : all_true<has_increment<Args>::value...> {};
template <class... Args>
struct not_have_increment : all_true<!has_increment<Args>::value...> {};
template <class I1, class I2, class I3, class UF, class UFB>
using enable_binary_algorithm_t = typename std::enable_if<have_increment<I1, I2, O1>::value && not_have_increment<UF, UFB>::value, int>::type;Besides, default template parameters are not considered by the compiler for overload selection, so it's better to use the enable_if as the template parameter evaluating to an int when it's valid:
template <class I1, class I2, class O1, class UF, class UFB,
                 enable_binary_algorithm_t<I1, I2, O1, UF, UFB> = 0>
...983a7c4    to
    0f40c17      
    Compare
  
    0f40c17    to
    8a81e7c      
    Compare
  
    6c6dc1f    to
    52984ef      
    Compare
  
    
New algorithms are added:
Tests are updated.
README.md updated.
Using the added algorithms I've create a
nanmean_fastfunction where the benchmark are the faster respect other implementation:BM_nanmeanhas a classic C style whileBM_nanmean_with_xtensoris essentiallyxt::mean(xt::filter(e, !xt::isnan(e)))whereeis a tensorBM_nanmean_xtis thext::nanmeanBM_nanmean_fastuse the addedreduce_batchandcount_ifalgorithms